An example of overfitting, but with an intuitive result


In [1]:
%pylab inline
from classy import *
from pylab import colorbar


Populating the interactive namespace from numpy and matplotlib
Keras not installed
Version:  0.0.19

In [2]:
d={}
for i in range(10):
    d['%d' % i]='data/simple_digits/%d.png' % i

images=image.load_images_from_filepatterns(**d)
images.target_names=[int(_) for _ in images.target_names]


[0]: 1 files found
	data/simple_digits/0.png
[1]: 1 files found
	data/simple_digits/1.png
[2]: 1 files found
	data/simple_digits/2.png
[3]: 1 files found
	data/simple_digits/3.png
[4]: 1 files found
	data/simple_digits/4.png
[5]: 1 files found
	data/simple_digits/5.png
[6]: 1 files found
	data/simple_digits/6.png
[7]: 1 files found
	data/simple_digits/7.png
[8]: 1 files found
	data/simple_digits/8.png
[9]: 1 files found
	data/simple_digits/9.png

In [3]:
images.data[0].shape


Out[3]:
(16, 12)
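
Each digit is a 16×12 grayscale image, and the classifier needs it as a flat feature vector. A minimal numpy sketch of the flattening that `images_to_vectors` performs in the next cell (plain numpy, not the classy helpers; the array here is a hypothetical stand-in for one image):

```python
import numpy as np

# Hypothetical stand-in for one 16x12 grayscale digit image
img = np.arange(16 * 12, dtype=float).reshape(16, 12)

# Flatten row by row into a single 192-dimensional feature vector
vec = img.ravel()
print(vec.shape)  # (192,)

# The inverse operation recovers the image for display,
# which is what vector_to_image does below
img_back = vec.reshape(16, 12)
print(np.array_equal(img, img_back))  # True
```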

In [4]:
data=image.images_to_vectors(images)


10 vectors of length 192
Feature names: 'p0', 'p1', 'p2', 'p3', 'p4'  , ... ,  'p187', 'p188', 'p189', 'p190', 'p191'  (192 features)
Target values given.
Target names: '0', '1', '2', '3', '4', '5', '6', '7', '8', '9'
Mean:  [  8.5  23.   54.5  95.2 128.1 142.3 140.  127.5 101.1  65.5  31.2  14.1
  18.5  45.3  98.3 152.4 190.4 205.3 201.3 187.4 162.3 116.6  61.9  28.2
  41.3  84.7 151.3 197.4 214.9 214.5 206.8 204.4 206.  176.4 106.7  54.5
  67.8 117.7 173.1 177.7 152.9 129.  119.8 135.  175.3 190.9 137.3  79.8
  87.  136.7 168.9 131.7  80.7  55.6  51.6  71.4 128.  175.6 145.3  92.6
  89.4 136.6 155.  104.2  56.4  46.2  52.9  70.7 118.5 162.9 136.8  87.6
  79.8 123.1 143.6 109.3  83.8  90.3 106.  123.1 151.8 167.4 125.1  75.2
  69.  108.  136.8 128.3 127.3 146.3 166.9 179.9 191.9 180.9 119.6  65.9
  60.1  93.6 121.  121.1 131.1 152.  170.1 179.7 189.  181.2 122.2  67.7
  54.8  81.7  96.3  86.3  91.6 109.7 122.4 130.5 152.6 168.  128.8  77.8
  58.5  84.2  91.   67.7  62.9  77.3  82.8  87.8 119.9 155.  130.2  84.1
  66.   98.1 114.1  90.8  77.3  81.8  82.1  86.3 120.3 152.5 124.9  79.9
  59.7  98.7 136.6 137.6 132.  132.8 131.  134.7 154.2 156.9 111.4  64.4
  37.6  73.4 126.2 162.6 183.6 195.7 193.3 184.8 174.3 141.8  84.2  41.9
  15.7  38.   82.  128.6 164.7 183.1 180.5 162.5 133.3  92.   47.6  20.1
   6.2  17.2  44.8  80.5 112.5 127.4 124.8 107.1  79.7  49.3  22.3   8.5]
Median:  [  1.   15.   57.5 111.5 140.5 147.  148.  140.5 111.5  64.   16.    2.
  15.   47.  113.5 181.5 208.  210.  215.5 205.5 180.  122.5  51.5  15.
  49.5 104.5 190.5 223.5 221.  223.5 223.5 219.5 221.  193.  115.   57.5
  72.5 132.5 215.5 199.5 151.5 118.5 111.  131.5 196.5 218.5 164.5  98.
 100.5 158.5 219.5 142.   62.5  28.   28.   60.5 141.5 217.  158.   98.
 101.5 168.5 195.  118.5  41.5  18.5  21.5  52.  124.  209.5 147.   84.
  79.  144.5 211.  118.5 104.5  95.5 105.5 128.  176.  214.5 149.5  79.5
  66.  129.  200.5 146.5 149.5 196.5 202.  215.  211.  212.  138.   66.5
  48.   87.5 146.5 135.  161.  178.  198.  214.  214.5 222.  143.   69.5
  25.   39.5  63.5  82.   89.5 108.  112.  128.  166.  221.  153.   84.
  29.5  42.5  57.   47.5  34.   27.5  28.5  82.  124.  213.  153.   84.
  50.5  85.5 126.5 120.   59.   28.   28.   65.  134.  215.  146.5  76.5
  71.  123.5 192.5 185.5 142.  116.  120.  129.  190.5 214.  132.5  72.5
  50.  104.5 181.5 220.  219.5 222.  226.  217.  213.  190.5 100.5  50.
  13.5  39.   93.  146.5 186.5 206.5 212.  191.5 152.  113.5  47.   15.
   0.   11.   45.5  93.  121.  140.  145.  121.   91.5  56.   13.5   0. ]
Stddev:  [ 20.37277595  27.7055951   40.17026263  43.61146638  34.51796634
  25.80329436  34.19649105  37.641068    39.98862338  42.10522533
  34.99657126  24.39036695  25.5         38.55139427  59.37011032
  64.72279351  45.95693636  32.8239242   47.07663964  54.38786629
  58.07417671  58.30814694  48.18184305  34.2192928   28.94494775
  48.83042085  72.43900883  68.59037833  35.57372626  25.74975728
  45.30518734  48.61933772  64.74102254  66.11384121  50.43817998
  35.83085263  46.59141552  69.55868026  80.81392207  69.13616998
  31.52284886  43.48562981  48.48463674  18.87855927  66.90149475
  76.81594886  63.29936809  44.45177162  60.56896895  84.23781811
  86.87514029  67.93828081  44.70805297  68.44150787  68.37426416
  33.49985075  65.40030581  87.11280044  81.38187759  60.60726029
  62.20964555  86.8092161   88.92356268  65.1502878   44.68825349
  71.97610715  71.69163131  45.51713963  64.90030817  85.69766625
  84.0211878   63.84543836  61.60973949  88.85094259 101.20592868
  85.74153019  62.27648031  72.9288009   69.8140387   53.42929908
  66.18730996  81.58823445  79.00563271  57.67807209  64.02811882
  92.72000863 110.79783391 109.80532774  96.76161429  91.53911732
  78.7152463   66.81983239  76.0005921   75.53204618  67.83244062
  48.26686234  62.57068003  90.49220961 106.17155928 104.42360844
  89.79471031  80.99382693  71.90751004  66.13781067  77.67753858
  78.0433213   68.63497651  48.06672446  64.05279073  92.14450608
  98.55663347  72.80666178  50.51771966  68.82158092  76.93009814
  58.41446739  67.70406192  85.18098379  83.03830441  59.94797745
  68.1208485   95.69932079  95.17352573  53.86845088  51.76765399
  91.6024563   91.13703967  55.69703762  63.67644777  92.25074525
  96.50989587  72.80583768  66.85506712  95.1161921   98.10958159
  63.87613013  61.63773195  90.23059348  82.07977826  50.27534187
  71.29663386  99.16980387  98.25828209  72.14076517  52.64798192
  83.59431799 106.4238695   92.62310727  63.94059743  67.42373469
  58.44826772  40.75549043  79.80827025  95.65087558  78.51649508
  52.08493064  31.95997497  58.4982051   92.01282519 100.22494699
  80.34824205  68.18951532  69.09421105  69.31204801  85.69136479
  85.32854153  58.14086343  34.04247347  22.95234193  38.7453223
  62.54118643  77.54250447  72.60447645  67.8033185   73.61691382
  74.33068007  74.14991571  67.3141887   49.56046812  30.7520731
  17.28467529  26.40757467  38.95330538  48.76115257  49.58880922
  48.88598981  53.41872331  52.67722468  49.25048223  42.90699244
  32.93645397  21.03924904]

In [5]:
image.vector_to_image(data.vectors[4,:],(16,12))
colorbar()


Out[5]:
<matplotlib.colorbar.Colorbar at 0x1a1dcd2ba8>

In [6]:
data.vectors-=data.vectors.mean()
data.vectors/=data.vectors.std()
image.vector_to_image(data.vectors[4,:],(16,12))
colorbar()


Out[6]:
<matplotlib.colorbar.Colorbar at 0x1a1ddf4c18>
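
The two lines in In [6] standardize the whole data matrix with one global mean and one global standard deviation, not per-feature as sklearn's `StandardScaler` would. A self-contained sketch of the same transform, using random stand-in data:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(0, 255, size=(10, 192))  # stand-in for the pixel vectors

# Global standardization: one scalar mean and std over ALL entries
X = X - X.mean()
X = X / X.std()

print(X.mean())  # close to 0
print(X.std())   # close to 1
```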

In [7]:
C=Perceptron()

In [8]:
timeit(reset=True)
C.fit(data.vectors,data.targets)
print("Training time: ",timeit())


Time Reset
Training time:  0.008205175399780273 seconds 

In [9]:
print("On Training Set:",C.percent_correct(data.vectors,data.targets))


On Training Set: 80.0
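
classy's `percent_correct` appears to be training-set accuracy times 100, and its `Perceptron` wraps sklearn's. A hedged sketch of the same fit/score loop with sklearn directly (the data here is a random stand-in with one example per class, not the digit images):

```python
import numpy as np
from sklearn.linear_model import Perceptron

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 192))  # stand-in: 10 vectors of length 192
y = np.arange(10)               # targets 0..9, one example per class

clf = Perceptron()
clf.fit(X, y)

# percent_correct is just accuracy on the same data, times 100
pct = clf.score(X, y) * 100
print(pct)
```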

In [10]:
C.weights


Out[10]:
array([[ 0.54703502,  0.19036593, -0.29708182, ...,  1.31981804,
         1.36737392,  1.36737392],
       [ 0.52325708, -0.01174655, -1.05797587, ...,  0.55892399,
         1.16526144,  1.34359598],
       [ 1.84307512,  1.41507222, -0.18993867, ...,  4.5894271 ,
         5.04120794,  4.3516477 ],
       ...,
       [ 2.14015694,  2.31849149,  2.22337973, ...,  0.60647987,
         1.1414835 ,  1.31981804],
       [ 1.72418543,  0.97518035, -0.10671589, ...,  2.38996772,
         2.58019124,  2.66341402],
       [ 3.25800492,  2.85377996,  1.99777415, ...,  1.56977124,
         3.19856008,  3.91189825]])

In [11]:
C.weights.shape


Out[11]:
(10, 192)
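
The (10, 192) shape means one weight row per digit class, one weight per pixel, so each row can be reshaped back into a 16×12 "template" and displayed like an image, which is what the plot in In [13] does. A sketch with a hypothetical weight matrix:

```python
import numpy as np

# Hypothetical stand-in for the (10, 192) weight matrix
W = np.arange(10 * 192, dtype=float).reshape(10, 192)

# Each row reshapes into a 16x12 template for that class
template = W[3, :].reshape(16, 12)
print(template.shape)  # (16, 12)
```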

In [12]:
C.output(data.vectors[2,:].reshape(1,-1))


Out[12]:
array([[ -61.56529126,  -77.34702196,   92.92704089, -128.3153225 ,
        -111.90751476,  -99.2861426 ,  -85.39152961,  -30.45245873,
        -113.40938449, -102.11371686]])
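
`C.output` is sklearn's `decision_function`: ten scores, one per class, and the predicted class is the one with the largest score. For the digit-2 input above, class 2's score (92.93) is the only positive one and the maximum, as expected. A sketch using the scores copied (rounded) from Out [12]:

```python
import numpy as np

# Decision scores copied from Out[12], one per digit class
scores = np.array([-61.57, -77.35, 92.93, -128.32, -111.91,
                   -99.29, -85.39, -30.45, -113.41, -102.11])

# The predicted class is the argmax of the decision scores
print(np.argmax(scores))  # 2
```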

In [13]:
import matplotlib.pyplot as plt
plt.figure(figsize=(16,4))
for i in range(10):
    plt.subplot(2,10,i+1)
    image.vector_to_image(data.vectors[i,:],(16,12))
    plt.axis('off')
    
    plt.subplot(2,10,i+11)
    image.vector_to_image(C.weights[i,:],(16,12))
    plt.axis('off')



In [14]:
data.targets


Out[14]:
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=int32)

In [15]:
data.vectors.shape


Out[15]:
(10, 192)

In [16]:
plt.plot(C.output(data.vectors[2,:]).ravel(),'-o')


---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-16-6d772fdf0ee0> in <module>()
----> 1 plt.plot(C.output(data.vectors[2,:]).ravel(),'-o')

~/anaconda3/lib/python3.7/site-packages/classy/supervised.py in output(self, vectors)
    156 
    157     def output(self,vectors):
--> 158         return self.decision_function(vectors)
    159 
    160 

~/anaconda3/lib/python3.7/site-packages/sklearn/linear_model/base.py in decision_function(self, X)
    298                                  "yet" % {'name': type(self).__name__})
    299 
--> 300         X = check_array(X, accept_sparse='csr')
    301 
    302         n_features = self.coef_.shape[1]

~/anaconda3/lib/python3.7/site-packages/sklearn/utils/validation.py in check_array(array, accept_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, warn_on_dtype, estimator)
    439                     "Reshape your data either using array.reshape(-1, 1) if "
    440                     "your data has a single feature or array.reshape(1, -1) "
--> 441                     "if it contains a single sample.".format(array))
    442             array = np.atleast_2d(array)
    443             # To ensure that array flags are maintained

ValueError: Expected 2D array, got 1D array instead:
array=[-1.34359598 -1.20092835 -1.20092835 -0.77292544 -0.19036593  0.26141491
  0.38030461  0.02363552 -0.67781369 -1.20092835 -1.34359598 -1.34359598
 -1.34359598 -1.22470629 -0.89181514 -0.22603284  0.59430606  1.14119866
  1.29575526  0.93908618  0.04741346 -0.83237029 -1.22470629 -1.34359598
 -1.20092835 -0.89181514 -0.19036593  0.72508472  1.29575526  1.35520011
  1.29575526  1.29575526  0.96286412  0.08308037 -0.83237029 -1.20092835
 -0.77292544 -0.22603284  0.72508472  1.29575526  1.05797587  0.32085976
 -0.03580933  0.35652667  1.02230896  0.90341927 -0.10714315 -0.77292544
 -0.22603284  0.57052812  1.29575526  1.02230896  0.08308037 -0.77292544
 -1.04637174 -0.72536956  0.35652667  1.08175381  0.53486121 -0.24981078
 -0.16658799  0.57052812  0.80830751  0.04741346 -0.83237029 -1.22470629
 -1.31981804 -1.04637174 -0.03580933  1.02230896  0.78452957 -0.01203139
 -0.77292544 -0.44003429 -0.44003429 -0.89181514 -1.22470629 -1.31981804
 -1.22470629 -0.77292544  0.32085976  1.11742072  0.57052812 -0.22603284
 -1.2603732  -1.22470629 -1.22470629 -1.28415113 -1.28415113 -1.16526144
 -0.83237029 -0.07147624  0.93908618  1.05797587  0.08308037 -0.72536956
 -1.34359598 -1.34359598 -1.34359598 -1.34359598 -0.96314896 -0.49947914
  0.09496934  0.74886266  1.02230896  0.51108327 -0.49947914 -1.07014968
 -1.34359598 -1.34359598 -1.22470629 -0.89181514 -0.24981078  0.45163842
  0.93908618  0.93908618  0.45163842 -0.30925563 -0.96314896 -1.2603732
 -1.34359598 -1.22470629 -0.89181514 -0.19036593  0.53486121  0.90341927
  0.68941782  0.09496934 -0.49947914 -0.96314896 -1.22470629 -1.34359598
 -1.20092835 -0.89181514 -0.19036593  0.62997297  1.05797587  0.74886266
  0.02363552 -0.65403575 -0.96314896 -1.10581659 -1.20092835 -1.2603732
 -0.77292544 -0.22603284  0.65375091  1.29575526  1.35520011  0.86775236
  0.29708182 -0.03580933 -0.13092109 -0.19036593 -0.44003429 -0.77292544
 -0.34492254  0.35652667  1.27197732  1.68809126  1.68809126  1.48597878
  1.33142217  1.23631042  1.23631042  1.23631042  0.72508472 -0.07147624
 -0.38058945  0.26141491  1.02230896  1.29575526  1.29575526  1.29575526
  1.29575526  1.29575526  1.29575526  1.29575526  0.78452957 -0.03580933
 -0.65403575 -0.22603284  0.26141491  0.40408255  0.40408255  0.40408255
  0.40408255  0.40408255  0.40408255  0.38030461  0.02363552 -0.49947914].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.

In [17]:
C.output(data.vectors[2,:].reshape(1,-1)).shape


Out[17]:
(1, 10)
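
The error in In [16] and the fix in In [17] come down to sklearn expecting 2D input of shape (n_samples, n_features), even for a single sample. The `reshape(1, -1)` idiom turns a 1D vector into a one-row matrix:

```python
import numpy as np

v = np.zeros(192)       # a single sample as a 1D array, shape (192,)
row = v.reshape(1, -1)  # sklearn wants (n_samples, n_features)
print(row.shape)        # (1, 192)
```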

Backprop, even though we don't need it here


In [ ]:
C=BackProp(hidden_layer_sizes=[4])
C.max_iter=5000
timeit(reset=True)
C.fit(data.vectors,data.targets)
print("Training time: ",timeit())

In [ ]:
print("On Training Set:",C.percent_correct(data.vectors,data.targets))

In [ ]:
C.layers_coef_[0].shape,C.layers_coef_[1].shape
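
classy's `BackProp` appears to wrap sklearn's `MLPClassifier`; assuming that mapping, a minimal equivalent sketch (random stand-in data again) showing the layer weight shapes that `layers_coef_` would report: input-to-hidden and hidden-to-output.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 192))  # stand-in for the digit vectors
y = np.arange(10)

# One hidden layer of 4 units, mirroring hidden_layer_sizes=[4]
clf = MLPClassifier(hidden_layer_sizes=(4,), max_iter=5000)
clf.fit(X, y)

# coefs_[0]: input->hidden weights, coefs_[1]: hidden->output weights
print(clf.coefs_[0].shape, clf.coefs_[1].shape)  # (192, 4) (4, 10)
```

The hidden layer is a 4-dimensional bottleneck: 192 inputs are squeezed through 4 units before fanning back out to 10 class outputs.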

In [ ]:
C.output(data.vectors[1,:].reshape(1,-1))

In [ ]: